2 research outputs found
An Improved Transformer-based Model for Detecting Phishing, Spam, and Ham: A Large Language Model Approach
Phishing and spam detection is a long-standing challenge that has been the
subject of much academic research. Large Language Models (LLMs) have vast
potential to transform society and provide new and innovative approaches to
well-established challenges. Phishing and spam have cost email users around
the world money, time, and resources, and frequently serve as an entry point
for ransomware threat actors. While detection approaches exist, especially
heuristic-based ones, LLMs offer the potential to explore a new, largely
uncharted avenue for understanding and solving this challenge. LLMs have
rapidly altered the landscape for businesses, consumers, and academia, and
demonstrate transformational potential for society. Applying these new
approaches to email detection is therefore a rational next step in academic
research. In this work, we present IPSDM, a model based on fine-tuning the
BERT family of models to specifically detect phishing and spam email. We
demonstrate that our fine-tuned version, IPSDM, classifies emails more
accurately on both unbalanced and balanced datasets. This work serves as an
important first step toward employing LLMs to improve the security of our
information systems.
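The abstract reports classification quality on both unbalanced and balanced datasets; per-class precision, recall, and F1 make that distinction concrete, since overall accuracy can look strong on an unbalanced set even when minority classes (phishing, spam) are missed. A minimal stdlib sketch, with the label set and the toy predictions purely illustrative rather than taken from the paper:

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1 for each class label."""
    metrics = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Illustrative unbalanced sample: "ham" dominates, and the lone
# spam email is misclassified as ham.
y_true = ["ham"] * 8 + ["spam"] + ["phishing"]
y_pred = ["ham"] * 8 + ["ham"] + ["phishing"]
scores = per_class_metrics(y_true, y_pred, ["phishing", "spam", "ham"])
```

Here overall accuracy is 90%, yet spam recall is zero, which is exactly the kind of failure a balanced evaluation exposes.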
Performance Analysis of Machine Learning Algorithm on Cloud Platforms: AWS vs Azure vs GCP
The adoption of cloud technology in enterprises is accelerating and becoming ubiquitous in business and industry. By migrating on-premises servers and services into the cloud, companies can leverage several advantages such as cost optimization, high performance, and flexible system maintenance, to name a few. As data volume, variety, veracity, and velocity rise tremendously, adopting machine learning (ML) solutions on a cloud platform brings benefits across the pipeline, from ML model building through model evaluation, more efficiently and accurately. This study provides a comparative performance analysis of the three big cloud vendors: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), by building regression models on each platform. For validation purposes, i.e., training and testing the models, five standard datasets from the UCI machine learning repository are employed. This work utilizes the ML services of AWS SageMaker, Azure ML Studio, and Google BigQuery for the experiments. Model evaluation criteria include the R-squared value on each platform and the error metrics (Mean Squared Error, Mean Absolute Error, Root Mean Squared Error, etc.), and the results are compared to determine the best-performing cloud provider in terms of ML service. The study concludes by presenting a comparative taxonomy of regression models across the three platforms.
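The evaluation criteria named in the abstract (R-squared plus the MSE, MAE, and RMSE error metrics) can all be computed from a model's predictions in a few lines. A minimal stdlib sketch, with the toy data purely illustrative and not drawn from the study's datasets:

```python
import math

def regression_metrics(y_true, y_pred):
    """R-squared and standard error metrics for regression predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {
        "r2": 1 - ss_res / ss_tot,
        "mse": mse,
        "mae": mae,
        "rmse": math.sqrt(mse),
    }

# Toy example: near-perfect predictions on four points.
m = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.0, 4.2])
```

Comparing these values across platforms on identical train/test splits is what makes a cross-vendor ranking like the one described here meaningful.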